# GQA Efficient Inference
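The heading refers to grouped-query attention (GQA), which Llama-family models use for efficient inference: several query heads share each key/value head, shrinking the KV cache by the grouping factor. A minimal NumPy sketch with illustrative shapes (not any listed model's actual configuration):

```python
import numpy as np

def gqa_attention(q, k, v, n_kv_heads):
    """Grouped-query attention: the query heads share n_kv_heads
    key/value heads (number of query heads must be a multiple of it).

    q: (n_q_heads, seq, d)   k, v: (n_kv_heads, seq, d)
    """
    n_q_heads, seq, d = q.shape
    group = n_q_heads // n_kv_heads
    # Repeat each KV head so every query head in its group attends
    # to the same keys/values -- only n_kv_heads K/V tensors are cached.
    k = np.repeat(k, group, axis=0)                   # (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)    # (n_q_heads, seq, seq)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ v                                # (n_q_heads, seq, d)

# Toy example: 8 query heads sharing 2 KV heads (4x smaller KV cache).
rng = np.random.default_rng(0)
out = gqa_attention(rng.normal(size=(8, 5, 16)),
                    rng.normal(size=(2, 5, 16)),
                    rng.normal(size=(2, 5, 16)), n_kv_heads=2)
print(out.shape)  # → (8, 5, 16)
```

With multi-head attention the KV cache scales with the number of query heads; with GQA it scales only with `n_kv_heads`, which is the memory saving the heading alludes to.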
## Llama-3.1-Minitron-4B-Width-Base (nvidia)
Llama-3.1-Minitron-4B-Width-Base is a base text-to-text model obtained by width-pruning Llama-3.1-8B, suitable for a range of natural language generation tasks.
Tags: Large Language Model · Transformers · English · License: Other
Downloads: 10.15k · Likes: 190
## Minitron-8B-Base (nvidia)
Minitron-8B-Base is a large language model obtained by pruning Nemotron-4 15B and recovering accuracy through knowledge distillation and continued training, requiring up to 40× fewer training tokens and 1.8× less compute than training from scratch.
Tags: Large Language Model · Transformers · English · License: Other
Downloads: 5,725 · Likes: 66
## Llama-3.1-8B (meta-llama)
Meta Llama 3.1 is a family of multilingual large language models with 8B, 70B, and 405B pretrained and instruction-tuned generative variants, optimized for multilingual dialogue use cases.
Tags: Large Language Model · Transformers · Multilingual
Downloads: 1.0M · Likes: 1,583
## Meta-Llama-3-70B (meta-llama)
Meta Llama 3 is a family of large language models with 8B and 70B pretrained and instruction-tuned generative text variants, optimized for dialogue use cases and performing strongly on industry benchmarks.
Tags: Large Language Model · Transformers · English
Downloads: 15.32k · Likes: 857